ABSTRACT



A queuing architecture and method for scheduling disk drive 
access requests in a video server. The queuing architecture 
employs at least two access request queues for each disk 
drive within a disk drive array, and a queue selector for 
selecting the first and second queues. The first queue is for 
disk access requests by steady-state users requesting new
data streams who are currently viewing a program from the 
video server. The second queue is for all other types of disk 
access requests, including requests by new users, requests 
for loading content, disk maintenance, meta-data 
synchronizing, and the like. Steady-state disk access 
requests are serviced in order of ascending time deadlines. 
The queue selector gives highest priority to requests in the 
first queue, and requests from the second queue are serviced 
only upon a guarantee that all of the steady-state requests in 
the first queue will meet their time deadlines in the worst 
case access times for the disk drives. The second queue may 
comprise multiple queues to provide a queuing hierarchy 
including a third queue. Non-steady-state disk access
requests are serviced on a first-in-first-out basis. The queue
selector establishes priority in response to the location of the 
data upon a disk in a disk drive, and data may be stored in 
a disk drive based upon the queuing priority. The selecting 
and forwarding of the access requests to a disk drive may be 
performed repeatedly while an internal queue within a disk
drive is not full, and there are outstanding access requests for 
that disk drive. 

22 Claims, 8 Drawing Sheets 
QUEUING ARCHITECTURE INCLUDING A PLURALITY OF QUEUES AND
ASSOCIATED METHOD FOR SCHEDULING DISK ACCESS REQUESTS FOR
VIDEO CONTENT

The invention relates to methods of scheduling disk access
requests in a video server, and, more particularly, to
statistical scheduling methods that improve the effective
disk bandwidth provided by video servers.

BACKGROUND OF THE DISCLOSURE

Video-on-demand systems allow subscribers to request video
programs from a video library at any time for immediate
viewing in their homes. Subscribers submit requests to a
video service provider via a communication channel (e.g.,
telephone lines or a back channel through the distribution
network that carries the video to the subscriber's home), and
the requested video program is routed to the subscriber's
home via telephone or coaxial television lines. In order to
provide such movie-on-demand services, video service
providers use a video server to process subscriber requests,
retrieve the requested programs from storage, and distribute
the programs to the appropriate subscriber(s). One exemplary
system for providing video-on-demand services is described
in commonly assigned U.S. Pat. No. 6,253,375, issued Jun. 26,
2001, which is incorporated herein by reference.

In order for video servers to provide good performance, it is
crucial to schedule video storage (disk) access requests such
that disk bandwidth is maximized. Also, once a subscriber is
watching a program, it is imperative to continuously deliver
program content to the subscriber without interruption. In
addition to distributing content to the subscribers, disk
bandwidth in a video server is typically also required for
operations such as loading content, disk maintenance, and
file system meta-data synchronizing. Disk bandwidth may also
be reserved for reducing latency in data transfer to
subscribers. The number of subscribers that can be properly
served concurrently by a video server therefore depends on
effective disk bandwidth, which in turn depends on how disk
access requests are scheduled.

One of the problems facing current disk scheduling methods
is the potential variation in time required to service disk
accesses. For example, the internal transfer rate of a
SEAGATE CHEETAH® disk varies from 152 Megabits per second
(Mbps) on inner tracks to 231 Mbps on outer tracks, and the
seek time can vary from 0 milliseconds (ms) to 13 ms
depending on how far apart the segments of data are from one
another. Given these variations in seek and transfer times
and the fact that the server may contain sixteen or more disk
drives, it is difficult to determine the effective disk
bandwidth of a video server. As a result, current disk
scheduling methods allocate a fixed amount of time for every
disk access request, regardless of whether the access
finishes early. This results in a deterministic system in
which the available disk bandwidth is known, but since the
fixed amount of time must be large enough to accommodate a
worst-case disk access, disk bandwidth is wasted.

Therefore, there is a need in the art for a method and
apparatus for scheduling disk access requests in a video
server without allocating worst-case access times, thus
improving disk bandwidth utilization.

SUMMARY OF THE INVENTION

The disadvantages associated with the prior art are overcome
by a method of the present invention, called Statistical
Disk Scheduling (SDS), which exploits the fact that disk
access times are on average significantly less than the worst
case access time. The SDS finds use in improving video
server functionality by increasing the bandwidth utilization
of the storage medium in the following manner: worst case
performance is used for priority operations (e.g., user read
operations) but the bandwidth created by better than worst
case performance is used for non-priority operations such as
loading content onto the disk drives and disk maintenance.
As a result, bandwidth for loading content and disk
maintenance, or file system meta-data synchronizing does
not have to be specifically reserved, thus increasing the
number of users that can be served simultaneously by the
video server.

SDS maintains at least two queues and a queue selector.
The first queue is an access request queue for access requests
from current users that are presently viewing a program and
the second queue is for all other forms of access requests.
The second queue may comprise multiple queues to provide
a queuing hierarchy. The requests are ordered in each of the
queues to optimize the bandwidth and ensure that the data to
the current users is not interrupted such that a display
anomaly occurs. The queue selector identifies the queue that
will supply the next access request to a disk queue. The
selected requests are sent to the disk queues for execution.
The disk queues are generally located on the disk drives and
are generally not accessible except to place a request in the
queue for each disk drive. The requests are then executed in
a first-in, first-out (FIFO) manner. In effect, the invention
defers forwarding a request to the latest possible moment
because once a request is in the disk queue it is more
difficult to change. The inventive queue structure provides
opportunities to alter the disk access requests and their
execution order prior to sending the requests to the disk
queue. If a disk queue is not used (i.e., the disk drive does
not have an internal queue), then the access requests are sent
one at a time from the SDS to the disk drive for execution.

More specifically, the preferred embodiment of the SDS
maintains three queues for each disk based on the type and
priority of disk access requests, and a queue selector for
managing queue selection. Selected requests are forwarded
from the three queues to the disk such that bandwidth
utilization is maximized, while giving highest priority to
subscribers currently viewing a program so that their program
streams are generally not interrupted. (Subscribers
currently viewing a program are referred to as "steady-state"
subscribers.) SDS dynamically monitors bandwidth utilization
to determine when lower-priority requests can be scheduled
without affecting on-time completion of the higher
priority steady-state subscriber requests. In order to keep the
disks busy and maximize disk bandwidth utilization, disk
command queuing may be employed to ensure that the disk
can begin seeking for the next access immediately after it
finishes the data transfer for the current access.

Furthermore, popular content is migrated to the faster
(outer) tracks of the disk drives to reduce the average access
time and improve performance.

BRIEF DESCRIPTION OF THE DRAWINGS

The teachings of the present invention can be readily
understood by considering the following detailed description
in conjunction with the accompanying drawings, in
which:

FIG. 1 depicts a high-level block diagram of a video-on-demand
system that includes a video server incorporating
the present invention;

FIG. 2 depicts the queuing architecture of the Statistical
Disk Scheduler used to perform the method of the present
invention;

FIG. 3 depicts a flowchart specification of the SDS
Selection Procedure;

FIG. 4 depicts a flowchart specification of the Scheduling
Interval Procedure;

FIG. 5 depicts a round-robin version of the Scheduling
Interval Procedure;

FIG. 6 depicts a flowchart specification of the Command
Completion Procedure;

FIG. 7 depicts a flowchart of a method for prioritizing the
Scheduling Interval Procedure (FIGS. 4 or 5) and Command
Completion Procedure (FIG. 6) of the present invention; and

FIG. 8 shows the software process architecture for a
preferred multi-threaded implementation of the method of
the present invention.

To facilitate understanding, identical reference numerals
have been used, where possible, to designate identical
elements that are common to the figures.

DETAILED DESCRIPTION

FIG. 1 depicts a video-on-demand system that utilizes a
video server incorporating the teachings of the present
invention. Specifically, video-on-demand system 100 contains
a video server 110 that communicates with a plurality
of disks 120_1 through 120_n (collectively disks 120) via a
Statistical Disk Scheduler (SDS) 170. In addition to the SDS
170, video server 110 contains a central processing unit
(CPU) 114 and memory element 117. SDS 170 is coupled to
disks 120 by paths 130_0 through 130_n (collectively paths
130) (e.g., fiber channel), and to memory 117 by data path
177. The video server sends access requests along paths 130
to disks 120, and each disk has its own internal queue
125_1 through 125_n (collectively queue 125) for buffering
access requests. Data read from the disks are transmitted
back to the video server along paths 130. The paths 130 are
serially coupled, i.e., "daisy chained" to form a data transfer
loop 131, e.g., a fiber channel loop. Although one loop is
depicted, multiple loops may be employed to interconnect
subsets of the disk drives such that the data transfer rate
amongst the disk drives and the video server is increased
over that of a single loop system. The video server contains
a Distribution Manager 180 that receives the data transmitted
along paths 130 and loop 131 and distributes this data to
subscribers 160_1 through 160_m (collectively subscribers
160) via a transport network 140. Additionally, disks 120
send messages called command completion messages (to be
discussed later) to the SDS 170 along paths 130.

The transport network 140 is typically, but not
exclusively, a conventional bi-directional hybrid
fiber-coaxial cable network. Subscribers 160 are coupled to the
transport network 140 by paths 150_1 through 150_m
(collectively transport network paths 150) (e.g., coaxial
cable). Additionally, transport network 140 forwards
subscriber access requests along path 175 to the SDS 170, and
receives video data from Distribution Manager 180 via path
185.

Commonly assigned U.S. Pat. No. 6,253,375, issued Jun.
26, 2001, which is incorporated herein by reference,
describes an information distribution system, known as the
ONSET™ system, which uses a video server that may
benefit from the present invention. Additionally, the video
server of the ONSET™ system is described in U.S. Pat. Nos.
5,671,377 and 5,581,778 which are both herein incorporated
by reference.

The SDS 170 performs the method of the present invention.
FIG. 2 depicts a logical representation of the queuing
architecture of the Statistical Disk Scheduler (SDS). In a
physical representation, the outputs of each queue are
connected to the data loop (131 of FIG. 1). In the depicted
embodiment, the SDS queuing architecture contains three
queues for each disk and a queue selector 205 for
managing queue selection, i.e., the queue selector determines
which queue is to transfer the next access request to
a disk drive. For simplicity, the logical representation is
more easily understandable. Although FIG. 2 depicts three
queues for each disk drive, a greater or lesser number of
queues may be used to fulfill the invention, i.e., at least two
queues should be used; one for the "steady-state" requests
and one for all other requests.

In the three queue embodiment of the SDS 170, a steady-state
subscriber queue (SSQ) 221_1 through 221_n
(collectively steady-state subscriber queues 221) is used for
"steady-state" subscriber disk reads for active streams (i.e.,
continuous content retrieval for distribution to subscribers
currently watching a program.) Disk access requests in SSQ
221 are assigned the highest priority. A new subscriber
queue (NSQ) 222_1 through 222_n (collectively new subscriber
queues 222) is for subscriber requests to begin
viewing a program or perform other program related
commands, i.e., non-steady state commands such as fast
forward or rewind that in essence are a request for a new data
stream. Disk access requests in NSQ 222 are assigned
medium priority. The other request queue (ORQ) 223_1
through 223_n (collectively other request queues 223) is for
all non-subscriber operations, such as loading content, disk
maintenance, and file system meta-data synchronizing. Disk
access requests in ORQ 223 are assigned the lowest priority.

Queues 221_n, 222_n, and 223_n are collectively called the
SDS queues 200_n, where n is an integer greater than zero that
represents a disk drive 120_n in an array of disk drives 120.
For each disk 120_n, the queue selector 205 selects requests
from the three SDS queues 221_n, 222_n, and 223_n and
forwards the requests to the corresponding disk queue 125_n
via paths 251_1 to 251_n, 252_1 to 252_n, and 253_1 to 253_n,
respectively. Each request has an associated worst-case
access time based on the type of request and data transfer
size. The worst-case access time can be fixed, or dynamically
computed based on prior access time statistics.
Additionally, each steady-state subscriber request has a time
deadline for when the request must complete in order to
guarantee continuous video for that subscriber. Disk
requests in the NSQ and ORQ generally do not have time
deadlines.

Requests in the SSQ 221_n are ordered by time deadline so
that the request at the front of the queue has the earliest
deadline. Consecutive SSQ requests with the same time
deadline are ordered by logical disk block address according
to an elevator algorithm. The elevator algorithm is a disk
scheduling algorithm well-known in the art in which the disk
head travels in one direction over the disk cylinders until
there are no more requests that can be serviced by continuing
in that direction. At this point, the disk head changes
direction and repeats the process, thus traveling back and
forth over the disk cylinders as it services requests. Since
requests in the NSQ and ORQ do not generally have
deadlines, they may be ordered on a first come first serve
basis, or according to some other desired priority scheme.

In order to keep the disks 120 busy and maximize disk
bandwidth utilization, disk command queuing may be
employed to ensure that the disk can begin the seek for the
next access immediately after it finishes the data transfer for
the current disk access. When a steady-state request needs to
access a sequence of multiple disks, the request is initially
added to the SSQ 221_1 of the first disk 120_1. After this
request is selected for servicing by the first disk 120_1, the
request is added to the second disk's SSQ 221_2 as soon as the
video server begins sending the data that was recalled from
the first disk 120_1 to the subscriber. Steady-state requests are
similarly added to the SSQ 221_n of each successive disk
120_n.
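
The SSQ ordering rule stated earlier (earliest time deadline first, with equal deadlines ordered by logical disk block address) can be illustrated with a short Python sketch. The class and field names are hypothetical, and ascending block order within a tie is an assumption that corresponds to a single upward sweep of the elevator algorithm.

```python
from dataclasses import dataclass

@dataclass
class SteadyStateRequest:
    stream: str     # hypothetical label for the subscriber stream
    deadline: int   # time by which the read must complete
    block: int      # logical disk block address

def order_ssq(requests):
    """Earliest deadline first; equal deadlines by ascending block address."""
    return sorted(requests, key=lambda r: (r.deadline, r.block))

ssq = order_ssq([
    SteadyStateRequest("A", deadline=40, block=900),
    SteadyStateRequest("B", deadline=28, block=500),
    SteadyStateRequest("C", deadline=40, block=100),
])
print([r.stream for r in ssq])  # B first, then C before A on the tie
```

Sorting on the pair (deadline, block) keeps the deadline as the primary key, so the tie-break never delays a request with an earlier deadline.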

The queue selector 205 employs an SDS Selection Procedure
to select requests from the three SDS queues 200_n
and forward the requests to an associated disk queue 125_n
located within each of the disk drives 120_n. The SDS
Selection Procedure uses worst-case access times, request
priorities, and time deadlines in determining which request
to forward to the disk queue. The general strategy of the SDS
Selection Procedure is to select a non-SSQ request only
when such a selection will not cause any of the SSQ 221_n
requests to miss their time deadlines, even if the non-SSQ
request and all requests in the SSQ 221_n were to take their
worst-case access times. If such a guarantee cannot be made,
then the first request in the SSQ is always selected. As an
optional step, once a request is selected, the SDS Selection
Procedure checks whether the data for the selected read
request is already in cache (if caching is used). If this is the
case, the disk access request can be discarded and the
Selection Procedure is repeated. Otherwise, the selected
request is removed from the SDS queue 221_n and forwarded
to an associated disk queue 125_n.

FIG. 3 depicts a flow diagram of the SDS Selection
Procedure 300. First, the Selection Procedure checks
whether the first entry in the NSQ can be selected while
guaranteeing that all SSQ requests will meet their time
deadlines in the worst case (step 320), where worst case is
defined by the system. Generally, the worst case value is the
access value having a per user error rate that is acceptable.

Each queue maintains "a sum of the worst case values" for a
selector that performs a worst case analysis and selects the
queue that will be used (in steps 320 and 330 described
below) to send the next command to the disk drive. The
following pseudocode represents the operation of such a
selector.

1) perform worst case analysis
   returns remaining time (the amount of time left on the
   SSQ if all commands take worst case time to
   execute; if the SSQ is empty, the remaining time is
   infinity)

2) if NSQ is !empty && NSQ.head.worstcase < remaining time
      take request off NSQ
   else if NSQ is empty && ORQ is !empty &&
           ORQ.head.worstcase < remaining time
      take request off ORQ
   else if SSQ is !empty
      take request off SSQ
      if request.deadline - request.worstcase < current time
         request missed deadline, terminate request, try
         selector again
   else
      no requests pending

Preference is given to the NSQ over the ORQ; requests are
taken off the ORQ only if the NSQ is empty.

The ORQ.head.worstcase and NSQ.head.worstcase are 
the respective worstcase access times to fulfill the next 
request in the ORQ and NSQ. The "remaining time" value 
is computed as follows: 






remaining time = disk Q Remaining Time (SSQ_n) - disk Q worst case (PQ_n)

disk Q Remaining Time (Q, now) {
    sum = 0
    min = MAX
    for each entry in Q {
        sum += entry->worstcase
        left = entry->deadline - sum - now;
        if (left <= 0 || entry->deadline < now) { /* out of time */
            min = 0;
            break;
        }
        if (min > left)
            min = left; /* there is now less time remaining */
    }
    return min;
}
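
As an illustration only, the remaining-time computation might be written in Python as follows. The sketch assumes that the slack of an entry is its deadline minus the accumulated worst-case time of all entries up to and including it, minus the current time, and that an empty SSQ yields infinite remaining time as the pseudocode states. The `Request` type and its field names are hypothetical, and the subtraction of the worst-case time already committed to the disk queue is omitted for brevity.

```python
import math
from dataclasses import dataclass

@dataclass
class Request:
    deadline: float    # absolute time by which the access must finish
    worstcase: float   # worst-case access time for this request

def remaining_time(ssq, now):
    """Slack on a deadline-ordered SSQ if every access takes its worst case."""
    total = 0.0
    slack = math.inf              # an empty SSQ leaves unlimited time
    for entry in ssq:
        total += entry.worstcase
        left = entry.deadline - total - now
        if left <= 0 or entry.deadline < now:
            return 0.0            # out of time: no lower-priority request fits
        slack = min(slack, left)  # the tightest entry bounds the whole queue
    return slack

ssq = [Request(deadline=100, worstcase=20), Request(deadline=130, worstcase=20)]
print(remaining_time(ssq, now=50))
```

With the two entries shown, the first entry leaves 100 - 20 - 50 = 30 time units and the second leaves 130 - 40 - 50 = 40, so the queue as a whole has 30 units of slack.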

The worst case access time value may be dynamically 
computed or empirically measured to be a cut off time that 
defines a period in which accesses have an acceptable error 
rate. If the first entry fulfills the requirement in step 320, then 
this first entry is selected (step 340); otherwise, the Selection 
Procedure checks whether the first entry in the ORQ can be 
selected while guaranteeing that all SSQ requests will meet 
their time deadlines in the worst case (step 330). If so, then 
this first entry is selected (step 350); otherwise, the proce- 
dure proceeds to step 315, wherein the procedure queries 
whether the first entry in the SSQ can be executed within its 
time deadline assuming the worst case access time. If the 
request cannot be executed in time, the request is discarded 
at step 325 and the procedure returns to step 320. 

If, however, the request can be executed in the allotted 
time in step 315, the first entry of the SSQ is selected at step 
360. The selected request is then removed from its queue 
(step 370). Furthermore, if caching is used, the Selection 
Procedure checks whether data for the selected request is 
already in cache (step 380) (the caching step 380 is shown 
in phantom to represent that it is an optional step). If the 
request is cached, the selected request is discarded and the 
Selection Procedure is repeated (step 382). Otherwise, the 
selected request is forwarded to the associated disk queue 
(step 390). 
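
The flow of FIG. 3 as described above can be sketched in Python. This is a minimal illustration, not the patented implementation: the `Request` type, the `cached` set, and the function names are hypothetical, and the sketch follows the prose (the ORQ is tried whenever the NSQ head does not qualify), whereas the pseudocode variant additionally requires the NSQ to be empty.

```python
import math
from collections import deque
from dataclasses import dataclass

@dataclass
class Request:
    name: str
    worstcase: float
    deadline: float = math.inf   # NSQ/ORQ requests generally have no deadline

def remaining_time(ssq, now):
    total, slack = 0.0, math.inf
    for entry in ssq:
        total += entry.worstcase
        left = entry.deadline - total - now
        if left <= 0 or entry.deadline < now:
            return 0.0
        slack = min(slack, left)
    return slack

def select(ssq, nsq, orq, now, cached=frozenset()):
    """Return the next request to forward to the disk queue, or None."""
    while True:
        slack = remaining_time(ssq, now)
        if nsq and nsq[0].worstcase < slack:      # step 320 -> step 340
            request = nsq.popleft()
        elif orq and orq[0].worstcase < slack:    # step 330 -> step 350
            request = orq.popleft()
        elif ssq:                                 # step 315
            request = ssq.popleft()
            if request.deadline - request.worstcase < now:
                continue                          # step 325: discard, retry
        else:
            return None                           # nothing pending
        if request.name in cached:                # optional steps 380/382
            continue
        return request                            # step 390

ssq = deque([Request("s1", 20, deadline=100)])
nsq = deque([Request("n1", 25)])
orq = deque([Request("o1", 40)])
picked = [select(ssq, nsq, orq, now=50).name for _ in range(3)]
print(picked)
```

In this run the SSQ has 30 units of slack, so the 25-unit NSQ request is taken first; the 40-unit ORQ request does not fit and must wait until the steady-state request has been forwarded.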

The SDS executes the Selection Procedure during two 
scheduling events, called the scheduling interval and the 
command completion event. The scheduling interval is a 
fixed, periodic interval, while a command completion event 
occurs every time one of the disks completes a command. 
(Note that it is possible, although highly unlikely, that 
multiple disks complete a command simultaneously at a 
command completion event.) At each scheduling interval, a 
procedure called the Scheduling Interval Procedure is 
executed, and at each command completion event, a proce- 
dure called the Command Completion Procedure is 
executed. In the case that a scheduling interval and a 
command completion coincide, the Command Completion 
Procedure is executed first (i.e., the Command Completion 
Procedure is given priority over the Scheduling Interval 
Procedure). Alternatively, if the disk queue has a depth that 
is greater than one, then the execution priority of these 
routines is reversed. Such reversal leaves more time avail- 
able to do other operations. 
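
The priority rule between the two procedures can be summarized in a few lines of Python. The function and the string labels are hypothetical; the sketch only encodes the ordering stated above, including the reversal for disk queues deeper than one.

```python
def procedure_order(command_completed, interval_elapsed, disk_queue_depth=1):
    """Order in which the two SDS procedures run for one scheduling event."""
    order = []
    if command_completed:
        order.append("command_completion")
    if interval_elapsed:
        order.append("scheduling_interval")
    if disk_queue_depth > 1:
        order.reverse()   # deeper disk queues reverse the priority
    return order

print(procedure_order(True, True))
print(procedure_order(True, True, disk_queue_depth=5))
```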

In the Scheduling Interval Procedure, steady-state 
requests are added to the next SSQ, if possible. (Recall that 
a steady-state request can be added to the next SSQ as soon 
as the data is output from the video server to the subscriber), 
and all SSQs are reordered to maintain correct time deadline
order. The first entries in each of the SSQs are then sorted 
based on time deadlines, which determines the order with 
which the disks are serviced. For each disk, the Selection 
Procedure 300 is repeatedly executed as long as the asso- 
ciated disk queue is not full, at least one of the three SDS 
queues (SSQ, NSQ, ORQ) is not empty, and there is a 
request in one of the three SDS queues that satisfies the 
Selection Procedure criteria. For example, in a three-disk
system whose disk queues are not full, if the first entry in
Disk 1's SSQ has a time deadline of 35, the first entry in
Disk 2's SSQ has a time deadline of 28, and the first entry
in Disk 3's SSQ has a time deadline of 39, then the disks
would be serviced in the following order: Disk 2, Disk 1,
Disk 3. Once the disk order has been established, the
SDS Selection Procedure is performed for each disk in that
order.
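
The three-disk example can be reproduced with a short Python sketch. The function name and the dictionary layout are illustrative assumptions; the deadlines are the ones given in the text.

```python
def service_order(first_deadlines):
    """Order disks by the time deadline of the first entry in each SSQ."""
    return sorted(first_deadlines, key=first_deadlines.get)

order = service_order({"Disk 1": 35, "Disk 2": 28, "Disk 3": 39})
print(order)
```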

Generally, in a video server application, the extents for the
data are very long (e.g., hundreds of kilobytes) such that the
disk queues have a depth of one. In other applications using
shorter data extents, the disk queues may have various
depths, e.g., five requests could be stored and executed in a
first-in, first-out (FIFO) manner. The extent size is inversely
proportional to disk queue depth where data delivery
latency is the driving force that dictates the use of a large 
extent size for video server applications. For other applica- 
tions where the extent size is relatively small, the disk queue 
depth is dictated by the desire to reduce disk drive idle time. 

FIG. 4 shows a formal specification of the Scheduling 
Interval Procedure 400 in flowchart form. First, the Sched- 
uling Interval Procedure adds steady-state requests to the 
appropriate SSQs, if possible (step 420), and reorders all the 
SSQs by time deadlines (step 430). The Scheduling Interval 
Procedure (400) then groups all the disks, where "D"=a set 
of all of the disks (step 440). The disk that has the earliest 
deadline for the first entry in its SSQ is then selected (step 
450). The Selection Procedure is performed for the selected 
disk (step 300), and then the Scheduling Interval Procedure 
checks whether a request satisfying the Selection Procedure 
criteria was selected (step 460). If not, the disk with the next 
earliest deadline for the first entry in its SSQ is selected 
(steps 475, 480, 450), and the Selection Procedure is 
repeated for this disk (step 300). Otherwise, the Scheduling 
Interval Procedure checks whether the selected disk's queue 
is full (step 465), or if all three SDS queues for the selected 
disk are empty (step 470). If either of these conditions is
true, then the disk with the next earliest deadline for the first 
entry in its associated SSQ is selected (steps 475, 480, 450) 
and the Selection Procedure is repeated for this disk (step 
300). If, however, both conditions are false, the Selection 
Procedure 300 is repeated for the same selected disk. Thus, 
the disks are processed sequentially, ordered by the corre- 
sponding SSQ's first deadline, where "processing" means 
that the Selection Procedure is invoked repeatedly until the 
disk queue is full or there are no more requests for that disk. 
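
A minimal Python sketch of this sequential filling follows. To keep it short, each SSQ is reduced to a list of deadlines and the Selection Procedure is replaced by a placeholder that simply hands over the SSQ head; the `Disk` class, `select`, and `scheduling_interval` are hypothetical names, and the NSQ and ORQ are left out.

```python
import math
from collections import deque

class Disk:
    def __init__(self, name, deadlines, depth=1):
        self.name = name
        self.ssq = deque(sorted(deadlines))  # SSQ reduced to its deadlines
        self.disk_queue = []                 # stand-in for the drive's internal queue
        self.depth = depth

def select(disk):
    """Placeholder Selection Procedure: returns the SSQ head, or None."""
    return disk.ssq.popleft() if disk.ssq else None

def scheduling_interval(disks):
    # Steps 440-450: visit disks by the deadline of each SSQ's first entry.
    order = sorted(disks, key=lambda d: d.ssq[0] if d.ssq else math.inf)
    for disk in order:
        # Steps 465/470: stop when the disk queue is full or nothing is pending.
        while len(disk.disk_queue) < disk.depth and disk.ssq:
            request = select(disk)           # step 300
            if request is None:              # step 460: nothing qualified
                break
            disk.disk_queue.append(request)
    return [d.name for d in order]

disks = [Disk("Disk 1", [35, 60], depth=2), Disk("Disk 2", [28]), Disk("Disk 3", [39, 45])]
visited = scheduling_interval(disks)
print(visited, [d.disk_queue for d in disks])
```

Each disk is filled completely before the next one is visited, which is why Disk 1 (depth two) receives both of its requests while Disk 3 (depth one) receives only its earliest.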

As disclosed in FIG. 4, the Scheduling Interval Procedure 
fills each of the disk queues one at a time, which is most 
efficient for small disk queues. In the preferred embodiment, 
a small disk queue is used, as it facilitates the latency 
reduction. In particular, as soon as the servicing of a request 
extends past its worst-case access time, the request is 
aborted by the SDS, i.e., the SDS "times-out" waiting for the 
request to be serviced and then moves to the next procedural 
step. To assist in error handling when using a disk queue 
with a depth that is greater than one, such that the server may 
determine which request was not fulfilled within a pre- 
defined time period, the server maintains a disk mimic queue 
that mimics the content of the disk queue of each of the disk
drives. As such, the server can poll the mimic queue to 
determine the nature of the errant request and send an 
"abort" command to the disk drive for that request. The disk 
drive will then process the next request in the disk queue and 
the server updates the mimic queue. 
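
One way to picture the mimic queue is the Python sketch below. It is an assumption-laden illustration: the class, its method names, and the rule that the head request's clock starts when it reaches the front of the queue are all hypothetical and are not taken from the text.

```python
from collections import deque

class MimicQueue:
    """Server-side copy of one drive's internal FIFO queue."""

    def __init__(self):
        self._entries = deque()      # (request_id, worstcase) in disk-queue order
        self._head_started = None    # time at which the head began service

    def forward(self, request_id, worstcase, now):
        if not self._entries:
            self._head_started = now
        self._entries.append((request_id, worstcase))

    def overdue(self, now):
        """Return the id of the head request if it exceeded its worst case."""
        if not self._entries:
            return None
        request_id, worstcase = self._entries[0]
        return request_id if now - self._head_started > worstcase else None

    def advance(self, now):
        """Drop the head after it completes or is aborted; the next one starts."""
        request_id, _ = self._entries.popleft()
        self._head_started = now if self._entries else None
        return request_id

mimic = MimicQueue()
mimic.forward("a", worstcase=20, now=0)
mimic.forward("b", worstcase=20, now=5)
print(mimic.overdue(now=25))   # "a" has run past its worst-case time
```

After an abort command is sent for the overdue request, calling `advance` keeps the mimic queue in step with the drive, which then moves on to the next queued request.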

In the case of large disk queues, however, filling the disk 
queues in a round-robin fashion may be more efficient. A 
round-robin version of the Scheduling Interval Procedure for 
large disk queues is shown in FIG. 5. As in the previous 
embodiment of the Scheduling Interval Procedure, steady- 
state requests are first added to the appropriate SSQs (step 
520), and disks are ordered by the deadlines of the first entry 
in each disk's SSQ. In this round-robin version, however, the 
Selection Procedure is executed only once for a disk, and 
then the next disk is selected. Once all disks have been 
selected, the round-robin Scheduling Interval Procedure 
goes through each of the disks once again in the same order, 
executing the Selection Procedure once per disk. This pro- 
cess is continued until no more requests can be added to any 
of the disk queues. 

Specifically, a vector D is defined as an ordered list of all
the disks, where the order is based on the time deadlines of
the first entry in each disk's SSQ (step 530). A Boolean
variable SELECT is initialized to false, and an integer
variable i is initialized to 1 (step 540). The following
condition is then tested: i=n+1 and SELECT=false (step
550). As will be seen shortly, this condition will only be true
when all of the disks have been selected and no requests
could be added to any of the disks' queues. Next (step 555),
if i=n+1 (i.e., the last disk had been selected in the previous
iteration), then i is set to 1 (start again with the first disk). If
disk D_i's disk queue is full (step 560), or all three of D_i's
SDS queues are empty (step 570), then the next disk is
selected (step 585). Otherwise, the Selection Procedure is
performed for D_i (step 300), and if a request satisfying the
Selection Procedure criteria was found, SELECT is set to true
(step 580), and the next disk is selected (step 585). Thus the
SELECT variable indicates whether a request was added to
one of the disk queues during a pass over the vector of disks.
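
The round-robin passes can be sketched in Python as follows. As a simplification, each SSQ is reduced to a list of deadlines, the Selection Procedure is a placeholder that hands over the SSQ head, and the `selected` flag is reset at the start of every pass; these choices and all names are assumptions made for the example.

```python
import math
from collections import deque

class Disk:
    def __init__(self, name, deadlines, depth=1):
        self.name = name
        self.ssq = deque(sorted(deadlines))  # SSQ reduced to its deadlines
        self.disk_queue = []                 # stand-in for the drive's internal queue
        self.depth = depth

def select(disk):
    """Placeholder Selection Procedure: returns the SSQ head, or None."""
    return disk.ssq.popleft() if disk.ssq else None

def round_robin_interval(disks):
    # Step 530: vector D, ordered by the first deadline in each SSQ.
    order = sorted(disks, key=lambda d: d.ssq[0] if d.ssq else math.inf)
    forwarded = []
    selected = True
    while selected:                  # step 550: stop after a pass that adds nothing
        selected = False
        for disk in order:
            if len(disk.disk_queue) >= disk.depth or not disk.ssq:  # steps 560/570
                continue
            request = select(disk)   # step 300, once per disk per pass
            if request is not None:
                disk.disk_queue.append(request)
                forwarded.append((disk.name, request))
                selected = True      # step 580
    return forwarded

a = Disk("A", [10, 30], depth=2)
b = Disk("B", [20, 40], depth=2)
sequence = round_robin_interval([a, b])
print(sequence)
```

Compared with sequential filling, the forwarded requests alternate between the disks, so no drive waits for another drive's queue to be filled completely.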

The Command Completion Procedure is executed, on a
first-in, first-out basis, every time a disk completes a
command. Thus, for each completed command, the Command
Completion Procedure executes in the order in which the 
commands are completed, i.e., using a FIFO command 
handling step. As shown in FIG. 6, the Command Handling 
Procedure begins at step 610, proceeds to the FIFO com- 
mand handling step 605 and ends at step 690. 

Alternatively, the procedure can be adapted to handle 
simultaneous command events. In this procedure, it is first 
determined if multiple disks have completed a command 
simultaneously at the command completion event. (Most 
likely only one disk will have completed a command at the 
command completion event, but the multiple-disk situation 
is possible.) If more than one disk has completed a 
command, then the first entries in the SSQs of these disks are 
sorted based on time deadlines, thereby determining the 
order in which the disks are serviced. Once the disk order 
has been established, the SDS Selection Procedure is per- 
formed for each disk in order in the same manner as the 
Scheduling Interval Procedure (FIG. 4). That is, for each
disk, the Selection Procedure is repeatedly executed as long 
as the associated disk queue is not full, at least one of the 
three SDS queues (SSQ, NSQ, ORQ) is not empty, and there 
is a request in one of the three SDS queues that satisfies the 
Selection Procedure criteria. 

A formal specification of both forms of the Command 
Completion Procedure is shown in flowchart form in FIG. 6.
Step 605 represents the standard FIFO command handling
procedure, while the dashed box 615 represents an alternative
procedure capable of handling simultaneous command
occurrences. In this alternative version, the Command
Completion Procedure 600 determines which disks have just
completed a command (step 640), and the disk that has the
earliest deadline for the first entry in its SSQ is then selected
(step 650). Just as in the Scheduling Interval Procedure, the
Selection Procedure is performed for the selected disk (step
300), and then the Command Completion Procedure (step
600) checks whether a request satisfying the Selection
Procedure criteria was selected (step 660). If not, the disk
with the next earliest deadline for the first entry in its SSQ
is selected (steps 675, 680, 650) and the Selection Procedure
is repeated for this disk (step 300). Otherwise, the Command
Completion Procedure checks whether the selected disk's
queue is full, or if all three SDS queues for the selected disk
are empty. If either of these conditions is true, then the disk
with the next earliest deadline for the first entry in its SSQ
is selected (steps 675, 680, 650) and the Selection Procedure
is repeated for this disk (step 300). If, however, both
conditions are false, the Selection Procedure is repeated for
the same selected disk.
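
The alternative procedure of box 615 can be sketched in Python as
follows. This is a minimal illustration under stated assumptions, not
the flowchart of FIG. 6 itself: the names Disk, ssq_safe, select and
command_completion, the constant WORST_CASE, and the disk queue depth
are all hypothetical. The select function is only a stand-in for the
Selection Procedure (step 300); it follows the priority order recited
in the claims (new-stream requests, then other requests, then the
steady-state request with the earliest deadline), and its worst-case
deadline test is an assumed simplification.

```python
from collections import deque
from dataclasses import dataclass, field

WORST_CASE = 0.1  # assumed worst-case access time per request, in seconds


@dataclass
class Disk:
    name: str
    depth: int = 2                             # assumed disk queue depth
    ssq: list = field(default_factory=list)    # (deadline, request), ascending deadline
    nsq: deque = field(default_factory=deque)  # FIFO
    orq: deque = field(default_factory=deque)  # FIFO
    queue: list = field(default_factory=list)  # requests outstanding at the drive


def ssq_safe(disk, now, extra):
    """Assumed test: every SSQ request still meets its deadline if `extra`
    more requests are served first, each taking WORST_CASE seconds."""
    ahead = len(disk.queue) + extra
    return all(now + (ahead + i + 1) * WORST_CASE <= deadline
               for i, (deadline, _) in enumerate(disk.ssq))


def select(disk, now):
    """Stand-in for step 300: NSQ first, then ORQ, then earliest-deadline SSQ."""
    for q in (disk.nsq, disk.orq):
        if q and ssq_safe(disk, now, extra=1):
            return q.popleft()
    if disk.ssq:
        return disk.ssq.pop(0)[1]
    return None  # nothing satisfies the selection criteria


def command_completion(completed, now):
    """Refill every disk that just completed a command; return the visit order."""
    # Steps 640/650: visit disks by the deadline at the head of each SSQ.
    # Placing disks with an empty SSQ last is an assumption of this sketch.
    order = sorted(completed,
                   key=lambda d: d.ssq[0][0] if d.ssq else float("inf"))
    for disk in order:
        while len(disk.queue) < disk.depth:  # stop once the disk queue is full
            request = select(disk, now)      # step 300
            if request is None:              # step 660: move on to the next disk
                break
            disk.queue.append(request)
    return [d.name for d in order]
```

The check for all three SDS queues being empty is folded into select,
which returns None in that case, so the loop moves on to the next disk
exactly when the flowchart would.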

As disclosed in FIG. 6, the Command Completion Procedure
fills each of the disk queues one at a time, i.e., the
disk with a completion event is refilled. Note that since it is
highly unlikely that more than one disk is serviced on a
command completion event, the choice of whether to
employ round-robin or sequential filling of the disk queues
in the Command Completion Procedure has essentially no
impact on performance.

In both the Scheduling Interval and Command Completion
Procedures, the ordering of requests within the disk
queues is managed by the video server central processing
unit (CPU), and not the disks themselves. (Any reordering
operations normally performed by the disk must be
disabled.) While reordering by the disks would improve the
average seek time, managing the disk queues by the CPU is
required to preserve the time deadlines of the user requests.

FIG. 7 depicts a flowchart of a method 700 for prioritizing
the Scheduling Interval Procedure (FIG. 4 or 5) and
Command Completion Procedure (FIG. 6) of the present
invention. The method 700 starts at step 710 and proceeds to
step 720. In step 720, it is determined whether a command
completion event has occurred. Whenever a command
completion event occurs, the Command Completion Procedure
is invoked (step 600). Furthermore, in step 730, it is
determined whether a scheduling interval has occurred. If, in
step 730, a scheduling interval occurs, the Scheduling Interval
Procedure is invoked (step 400 or 500). As shown in
FIG. 7, if both a scheduling interval and a command
completion event occur simultaneously, the command
completion is given priority and the Command Completion
Procedure (step 600) is executed first. Alternatively, as
discussed above, when a disk queue having a depth that is
greater than one is used, the execution priority for these
procedures is reversed.
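
The priority rule of method 700 can be illustrated with a short Python
sketch. The function name dispatch, its parameters, and the string
labels are hypothetical and stand in for invoking the actual
procedures; only the ordering rule is taken from the description above.

```python
def dispatch(command_completed, interval_elapsed, disk_queue_depth=1):
    """Return the procedures to run, in execution order, for one pass."""
    steps = []
    if command_completed:                    # step 720
        steps.append("command_completion")   # step 600
    if interval_elapsed:                     # step 730
        steps.append("scheduling_interval")  # step 400 or 500
    if disk_queue_depth > 1:
        steps.reverse()  # priority is reversed for disk queues deeper than one
    return steps
```

With a disk queue depth of one, a simultaneous event yields the Command
Completion Procedure first; with a greater depth, the Scheduling
Interval Procedure runs first.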

In a preferred embodiment, the method of the present
invention is implemented as a multi-threaded process. FIG.
8 shows the software process architecture 800 for the
preferred embodiment. The media control thread 810
receives new-subscriber request messages from the transport
network 140 and path 175, and forwards these requests
through message queues 815 to the T_s loop thread 820. The
T_s loop thread 820 is a top level scheduler responsible for
two primary functions: first, it maintains all state information
necessary to communicate with the disk interfaces 835
(835_0 to 835_n) and video server memory 840; second, it
performs the Scheduling Interval Procedure using a period
of, for example, 100 ms. The T_s loop thread 820 allocates
the commands to the SDS queues 825, where each disk drive
is associated with a set of queues (e.g., SSQ, NSQ and other
queues) generally shown as queues 825_0, 825_1, . . . 825_n. At
the startup condition, when the disks are idle, the initial
commands (startup commands) from the T_s loop thread 820
are sent from the SDS queues 825 directly to the disk
interfaces 835. Under steady-state operation, a response
thread 830 communicates the commands from the SDS
queues 825 to the disk drive interfaces 835. Each interface
835 communicates to individual disk drives through a fiber
channel loop. Response thread 830 also receives command
completion messages from the disk interfaces 835. Upon
receiving these messages the response thread performs the
Command Completion Procedure (step 600). Media control
thread 810, T_s loop thread 820, and response thread 830 are
all executed by video server CPU 114 of FIG. 1.
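
The hand-off between the three threads can be sketched with Python's
standard threading and queue modules. This is a simplified illustration
only: the function run_pipeline, the (disk, command) request format,
and the None shutdown sentinel are assumptions of the sketch. The
periodic 100 ms scheduling, the separate SSQ/NSQ queues per disk, and
the Command Completion Procedure are omitted, and a plain list stands
in for the disk interfaces 835.

```python
import queue
import threading


def run_pipeline(requests, num_disks=2):
    """Pass (disk, command) requests through three cooperating threads."""
    messages = queue.Queue()                            # message queues 815
    sds = {d: queue.Queue() for d in range(num_disks)}  # SDS queues 825
    sent = []                                           # stand-in for interfaces 835

    def media_control():
        # Forward incoming requests to the scheduler thread.
        for request in requests:
            messages.put(request)
        messages.put(None)  # hypothetical sentinel: no more requests

    def ts_loop():
        # Allocate each command to the queue of its target disk.
        while (request := messages.get()) is not None:
            disk, command = request
            sds[disk].put(command)
        for q in sds.values():
            q.put(None)  # tell the response thread each queue is drained

    def response():
        # Move commands from the SDS queues to the disk interfaces.
        for disk, q in sds.items():
            while (command := q.get()) is not None:
                sent.append((disk, command))

    threads = [threading.Thread(target=f)
               for f in (media_control, ts_loop, response)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sent
```

Using thread-safe queues between the stages means no explicit locking
is needed in this sketch; each stage blocks until its upstream stage
supplies work.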

While this invention has been particularly shown and 
described with references to a preferred embodiment 
thereof, it will be understood by those skilled in the art that 
various changes in form and details may be made therein 
without departing from the spirit and scope of the invention 
as defined by the appended claims. 

What is claimed is: 

1. A queuing architecture for scheduling disk drive access 
requests in an information server, comprising, for each disk 
drive associated with said information server: 

a first queue for disk access requests from users currently 
receiving information provided by the information 
server; 

a second queue for all other disk access requests; and

a queue selector for selecting requests from said first and
second queues and forwarding said requests to a disk
drive associated with the information server.

2. The queuing architecture of claim 1, wherein said 
information server determines worst-case disk access times 
for each request and whether requests in said first and second 
queues can be executed within the time deadlines of the first 
queue, assuming said worst-case disk access times.

3. The queuing architecture of claim 1, where requests in 
said first queue each have an associated time deadline. 

4. The queuing architecture of claim 1, where requests in 
said first queue are ordered from front to back by ascending 
time deadlines. 

5. The queuing architecture of claim 1, where said queue 
selector gives highest priority to requests in said first queue 
and lower priority to requests in said second queue. 

6. The queuing architecture of claim 1, wherein said 
information server is a video server and said information is 
a video program. 

7. A queuing architecture for scheduling disk drive access 
requests in an information server, comprising, for each disk 
drive associated with said information server: 

a first queue for disk access requests from users currently 
receiving information provided by the information 
server; 

a second queue for all disk access requests from users
requesting new data streams;

a third queue for all other disk access requests; and

a queue selector for selecting requests from said first,
second, and third queues and forwarding said requests
to a disk drive associated with the information server.

8. The queuing architecture of claim 7, wherein said 
information server determines worst-case disk access times
for each request, and whether requests in said first and
second queues can be executed within the time deadlines of 
the first queue, assuming said worst-case disk access times. 

9. The queuing architecture of claim 7, where requests in
said first queue each have an associated time deadline.

10. The queuing architecture of claim 7, where requests in
said first queue are ordered from front to back by ascending
time deadlines.

11. The queuing architecture of claim 7, where said queue
selector gives highest priority to requests in said first queue,
a lower priority to requests in said second queue and a
lowest priority to requests in said third queue.

12. The queuing architecture of claim 7, where said third
queue contains requests for loading content and disk
maintenance.

13. The queuing architecture of claim 7, wherein said 
information server is a video server and said information is 
a video program. 

14. A method of scheduling access requests for a disk
drive in an information server, said method comprising the
steps of:

(a) providing at least two queues, where a first queue
contains steady-state access requests from users being
supplied information from the information server and a
second queue for all other types of access requests;

(b) selecting an access request from said second queue, if
such selection does not cause a steady-state access
request to miss a time deadline within which the
steady-state access request must be completed to ensure
that the information being viewed is not interrupted;

(c) otherwise, selecting one of the steady-state access
requests; and

(d) forwarding the selected request to said disk drive.

15. The method of claim 14 wherein the selecting and
forwarding steps are performed repeatedly while an internal
queue within the disk drive is not full, and there are
outstanding access requests for said disk drive.

16. The method of claim 14 further comprising the steps
of checking, before forwarding the selected request to said
disk drive, if data for the selected request is in a cache, and
discarding the selected request if the data is in said cache.

17. The method of claim 14 wherein the steady-state
access request selected in step c is the steady-state access
request with the earliest time deadline.

18. A method of scheduling access requests for a plurality
of disk drives associated with a video server, said method
comprising the steps of:

ordering said disk drives by earliest time deadline of
steady-state access requests for each of said plurality of
disk drives; and

for each of said plurality of disk drives,

providing three queues, where a first queue contains
steady state access requests from users being supplied
a program from the video server, a second queue
contains new programming access requests and a third
queue for all other types of access requests;

selecting a next new programming access request, if such
selection does not cause any steady-state access request
to miss its associated time deadline within which that
steady-state access request must be completed, to
ensure that the program being viewed is not interrupted;

otherwise, selecting a next other access request, if such
selection does not cause any steady-state access request
to miss its associated time deadline within which that
steady-state access request must be completed, to
ensure that the program being viewed is not interrupted;

otherwise, selecting a steady-state access request with the
earliest associated time deadline within which that
steady-state access request must be completed, to
ensure that the program being viewed is not interrupted; and

forwarding the selected request to a disk drive of said
plurality of disk drives that is associated with the
selected request.

19. The method of claim 18, where the selecting and
forwarding steps are performed repeatedly while an internal
queue of the disk drive associated with the selected request
is not full, and there are outstanding access requests for the
disk associated with the selected request.

20. A method of scheduling access requests for a plurality
of disk drives associated with a video server, said method 
comprising the steps of: 

providing, for each of the disk drives associated with said 
plurality of disk drives, three queues, where a first 
queue contains steady-state access requests from users 
being supplied a program from the video server, a 
second queue contains new programming access 
requests and a third queue for all other types of access 
requests; 

determining which of said disk drives has completed a 
command; 

ordering said disk drives that have completed a command 
by earliest time deadline of steady-state access requests 
for said disk drives; and 

for each of said disk drives that has completed a 
command, 

selecting the next new programming request, if such 
selection does not cause any steady-state access request 
to miss its associated time deadline within which that 
steady-state access request must be completed, to 
ensure that the program being viewed is not inter- 
rupted; 

otherwise, selecting a next other type access request, if 
such selection does not cause any steady-state access 
request to miss its associated time deadline within 
which that steady-state access request must be 
completed, to ensure that the program being viewed is 
not interrupted; 

otherwise, selecting the steady-state access request with 
the earliest associated time deadline; and 

forwarding the selected request to a disk drive of said 
plurality of disk drives that is associated with the 
selected request. 

21. The method of claim 20, where said method is 
performed every time one of said disk drives has completed 
a command. 

22. The method of claim 20, where said method is 
performed within a scheduling interval.